Apparatus and Method for Decoding Low Density Parity Check Coded Signals

ABSTRACT

The disclosed embodiments relate to an apparatus and method for decoding signals in a receiver, such as signals using low density parity check error correction. The apparatus includes a link circuit. The link circuit may include a first memory, a first and second processing block, and also include a first shift circuit for shifting data before entering one of the processing blocks and a second shift circuit for reversing the first shift after exiting the processing block. The link circuit may also include a second memory used for intermediate storage and shared by the first and second processing block. The method includes reading data from a memory, shifting the data prior to processing, processing the data, and then reverse shifting the data prior to writing it back to the memory.

FIELD OF THE INVENTION

The present invention relates generally to a communications receiver. More specifically, the present invention relates to the processing of error correction signals in systems using methods such as low density parity check coding.

BACKGROUND OF THE INVENTION

This section is intended to introduce the reader to various aspects of art, which may be related to various aspects of the present invention that are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present invention. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.

As most people are aware, satellite television systems have become much more widespread over the past few years. In fact, since the introduction of digital satellite television in 1994, more than twelve million American homes have become satellite TV subscribers. Most of these subscribers live in single-family homes where satellite dishes are relatively easy to install and connect. For example, the satellite dish may be installed on the roof of the house. In order to continue this growth, the customer often expects more every year from the service. The service providers thus are constantly considering new features and upgrades such as recording, multi-room operation, and larger and better content. Recently, more attention has become focused on high definition video and audio signals.

High definition signals require more capacity or bandwidth than the services currently provided on the satellite system. Also, many high definition services are provided in addition to, rather than as a replacement for, the current service. In order to provide these new services, some service providers are increasing the total capacity of their systems. Capacity can be increased in a number of ways including increasing the number of transponders or satellite channels available or increasing the number of satellites used. The largest change to the satellite system involves changing the actual communications system specifications.

Recent advances in technology have allowed satellite service providers to consider increasing capacity by changing the system specifications in a number of ways including using a new decoding algorithm such as one created by the Moving Picture Experts Group (MPEG) commonly known as MPEG-4. Additionally, it is possible to utilize a more advanced modulation format such as eight level phase shift keying (8PSK) found in the standard created for digital video broadcast (DVB), known as DVB-S2. The DVB-S2 standard also provides for a new error correction system known as low density parity check (LDPC) coding, which allows a further increase in overall system capacity. Although these changes can increase capacity in the communications system, they also change the operating margins for receiving the signal and force changes in the receiver design.

Almost all communications systems, and particularly digital systems such as the satellite system described, employ some form of error correction method to improve receiver performance. These schemes can involve very complex functions that must be carried out at either the transmitter or receiver or both. As mentioned, one such scheme becoming prevalent is LDPC coding.

LDPC coding is a method of error correction that creates parity bits for small sections of a larger segment of a data stream, and appends these to the data stream. The data stream segment operated on may be quite large, for instance 64,800 bits, and parity bits may be created for much smaller sections, such as 3 bits in a group.

Although the processing is performed on segments of the data stream, the incoming data stream remains continuous. The incoming data stream is typically a video or audio signal in a digital form, so the processing must be completed in a finite amount of time. The time allotted for processing the data stream segment is typically referred to as the LDPC frame time. In addition, the parity groups may overlap, where a single bit may be a member of more than one parity group and therefore may have more than one parity bit responsible for the data bit's parity. The method of assigning these groups and parity bits is typically known and predetermined in order for the decoder, and in particular the controller for the decoder, to properly manage the parity checking process.

LDPC decoding may have several layers of processing complexity. A first and very simple parity operation can be performed on the data stream now containing the appended parity bits. The simple parity operation involves simply performing a conventional parity check based on the parity of the parity group and its associated parity bit. If an error is found, then it may be corrected based on the results of the parity group checking. However, it is possible that some errors remain uncorrectable because the exact bit that is in error may still be indeterminable. In order to further address performance of error correction, the same data bit is often used in more than one parity grouping.
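
As a minimal illustration of the simple parity operation and of overlapping parity groups, the following Python sketch recomputes the parity of each group and then lists the bit positions that are consistent with the pattern of failed checks. The group layout, the function names, and the even-parity convention are assumptions made for this example only and are not taken from any particular LDPC code.

```python
def failed_checks(bits, groups):
    """Return the indices of parity groups whose bits do not sum to an even value."""
    return [i for i, group in enumerate(groups)
            if sum(bits[j] for j in group) % 2 != 0]


def suspect_bits(bits, groups):
    """Return bit positions whose group membership matches the set of failed checks."""
    failed = set(failed_checks(bits, groups))
    if not failed:
        return []
    suspects = []
    for position in range(len(bits)):
        membership = {i for i, group in enumerate(groups) if position in group}
        if membership == failed:
            suspects.append(position)
    return suspects


# Hypothetical layout: six bits covered by three overlapping parity groups.
GROUPS = [(0, 1, 2), (2, 3, 4), (0, 3, 5)]
valid = [1, 1, 0, 1, 1, 0]        # every group has even parity
received = list(valid)
received[2] ^= 1                  # inject a single bit error at position 2

print(failed_checks(valid, GROUPS))     # [] : no group fails for the valid word
print(failed_checks(received, GROUPS))  # [0, 1] : both groups containing bit 2 fail
print(suspect_bits(received, GROUPS))   # [2] : the only position consistent with the failures
```

In this layout the overlap between groups 0 and 1 is what isolates the erroneous bit; with non-overlapping groups a failed check would only indicate that one of its member bits is wrong.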

Even with both simple parity and multiple group parity, all errors may still not be explicitly corrected. Additionally, due to the nature of the parity groupings, each bit contains not only information about its value intrinsically, but also extrinsically. The intrinsic information about the data bit may be characterized in terms of the actual value of the bit, including knowledge gained by performing parity check operations using the parity information. Extrinsic information involves information that can be determined about the value of the bit, based on values of other data bits in the data stream (e.g., data bits adjacent to the current data bit under process or other bits in the parity groupings). Using both intrinsic and extrinsic information requires a more complex correction algorithm that involves using both elements of information in an iterative process in order to ascertain the final correct bit value.

The decoder for LDPC coding performs a series of iterations on the received data in order to remove errors from the received data. These iterations consist of two primary steps. The first is called the check node calculation, where data is read from a memory and some arithmetic operations are performed. The results are then written back into the memory. The circuit that performs this first operation is the check node processing unit (CPU). The second step is called the bit node calculation where other data is read from memory and additional arithmetic operations are performed. The results are again written back into memory. The circuit that performs this operation is called the bit node processing unit (BPU). Each of these processing units performs complex calculations that require both processing power and a large local register memory storage for intermediate results. Also, the result of one CPU operation is used in the next BPU operation and vice versa, so each processor is acting on the same data stream segment and only one processing block can operate at a time.
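
The alternation between the two steps can be sketched with a hard-decision bit-flipping loop. This is a well-known simplification used here only for illustration; it is not the algorithm of the decoder described in this document. The function names `check_node_step`, `bit_node_step`, and `decode`, the parity group layout, and the iteration limit are all assumptions.

```python
def check_node_step(bits, groups):
    """Check node pass: one flag per parity group, True where the parity check fails."""
    return [sum(bits[j] for j in group) % 2 != 0 for group in groups]


def bit_node_step(bits, groups, failed):
    """Bit node pass: flip the bit(s) involved in the largest number of failed checks."""
    votes = [0] * len(bits)
    for group, is_failed in zip(groups, failed):
        if is_failed:
            for j in group:
                votes[j] += 1
    worst = max(votes)
    return [b ^ 1 if v == worst else b for b, v in zip(bits, votes)]


def decode(bits, groups, max_iterations=50):
    """Alternate the two steps until all checks pass or the iteration budget is spent."""
    bits = list(bits)
    for iteration in range(max_iterations):
        failed = check_node_step(bits, groups)   # results feed the bit node step
        if not any(failed):
            return bits, iteration
        bits = bit_node_step(bits, groups, failed)  # results feed the next check node step
    return bits, max_iterations


# Hypothetical layout: six bits, three overlapping parity groups, one bit in error.
GROUPS = [(0, 1, 2), (2, 3, 4), (0, 3, 5)]
received = [1, 1, 1, 1, 1, 0]                    # bit 2 was flipped in transit
decoded, iterations = decode(received, GROUPS)
print(decoded, iterations)                       # [1, 1, 0, 1, 1, 0] 1
```

Each step consumes the output of the other, which is why the two steps in the sketch run strictly one after the other rather than concurrently.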

The LDPC codes are constructed so that many of these calculations can be done in parallel on a long segment of the incoming data stream. One current implementation consists of 360 parallel calculation block units processing a bit stream segment that is 64,800 bits long. As mentioned, an LDPC decoder utilizing this type of algorithm must iterate through its process a number of times. Typically a decoder iterates between the CPU and BPU operations as many as 50 or more times in order to determine the final error corrected values for incoming data. And as mentioned, a CPU operation and a BPU operation can never be done at the same time because the BPU operation may depend on changes made during the previous CPU operation and vice versa.

Each of the 360 blocks has the same or similar bit connections relative to each other. This similarity allows for a decoder architecture that permits 360 parallel calculation units. In order to decode the data, there is a circuit block that is responsible for getting 360 pieces of data for the calculation units. For each step, this data comes from a different set of locations in memory.

Although the performance of LDPC codes may exceed preceding error correction methods, the decoder required for an LDPC code is much larger than older systems and further requires multiple iterations through its processing path. The error correction performance is determined, and limited, by the number of iterations that can be made through the processing path in a given timeframe, the LDPC frame time. Structural limitations, including memory allocation, and memory access processing may place an overall restriction on performance of the decoder.

Increasing the number of iterations that can be performed in an LDPC frame time may directly result in improved decoder performance. A circuit architecture and method that increases the number of iterations that can be performed during an LDPC frame time are therefore desirable. Similarly it is desirable to provide an efficient use of resources, such as memory, when constructing the decoder in order to save both size and power, and also increase decoder performance.

SUMMARY OF THE INVENTION

The present invention is directed towards an error correction system in a communications receiver, and is further directed to an efficient apparatus and method for operating an error correction system, such as a low density parity check error correction system.

The apparatus of the present invention includes a main memory for storing data to be processed, a switch for selecting which of two processors receives data, a first processor for performing first processing on the data, a second processor for performing second processing on the data and having a forward shift and a reverse shift operation to facilitate re-ordering of the data, and a controller for controlling the switching, shifting, and memory reading and writing functions.

The method of the present invention includes reading data from a first memory in an initial configuration, altering the configuration of the data, processing the altered data, returning the data back to the initial configuration, and storing the data back in the main memory.

BRIEF DESCRIPTION OF THE DRAWINGS

Advantages of the invention may become apparent upon reading the following detailed description and upon reference to the drawings in which:

FIG. 1 is a block diagram of a link circuit of the present invention.

FIG. 2 is a block diagram of one aspect of the present invention.

FIG. 3 is a block diagram of another aspect of the present invention.

FIG. 4 is a flow chart of an exemplary method of one aspect of the present invention.

The characteristics and advantages of the present invention may become more apparent from the following description, given by way of example.

DETAILED DESCRIPTION

One or more specific embodiments of the present invention will be described below. In an effort to provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.

The following describes a circuit used for receiving satellite signals. Other systems utilized to receive other types of signals where the signal input may be supplied by some other means may include very similar structures. Those of ordinary skill in the art will appreciate that the embodiment of the circuits described herein is merely one potential embodiment. As such, in alternate embodiments, the components of the circuit may be rearranged or omitted, or additional components may be added. For example, with minor modifications, the circuits described may be configured for use in non-satellite video and audio services such as those delivered from a cable network.

Turning now to FIG. 1, an exemplary link circuit 100 of the present invention is shown. At the input to the circuit, an analog to digital (A/D) converter 102 is connected to a frequency translator 104. An oscillator, such as a numerically controlled oscillator (NCO) 106, is also connected to the frequency translator 104. The output of the frequency translator 104 is connected to the anti-alias filter 108 and the anti-alias filter 108 is connected to the automatic gain control (AGC) amp block 110. The output of the AGC amp block 110 is connected to a decimator block 112 that is connected to the symbol timing recovery block 114. The symbol timing recovery block 114 is connected to a carrier tracking loop 116 and finally the carrier tracking loop block 116 is connected to the error correction block 118. A link processor 120 is connected to the NCO 106, the AGC amp block 110, the carrier tracking loop 116, and the link memory 122. For clarity, some connections and blocks may be omitted but one skilled in the art should recognize these omissions. The operation of each of these blocks will be further described below.

The link circuit 100 contains an A/D converter 102 for converting the one or more baseband signals delivered from a tuner, not shown, into a digital signal. The digital signal from A/D converter 102 represents a series of samples of the one or more baseband signals, where each sample contains, for instance, a 10 bit word of data. It is important to note that the preferred embodiment utilizes one or more baseband signals as the inputs to the A/D converter 102. However, in another embodiment the signal(s) provided by the tuner as inputs to the A/D converter 102 may be located at a frequency that is near baseband, or may be located at some other intermediate frequency (IF).

A clock signal, not shown, is also connected to the A/D converter 102 in order to produce the series of samples. The clock signal may be generated from another source such as a crystal and/or also may be further controlled by link processor 120. In one embodiment, the link processor 120 may determine the clock rate that is necessary for proper processing of the incoming received signal. In another embodiment, the sampling in the A/D converter 102 may be done at a fixed rate and processing, such as decimating the sampled signal down to the proper sampling rate, may be done in later blocks.

The digital signal from the A/D converter 102 is supplied to a frequency mixer or frequency translator 104. The frequency translator 104 also receives an input signal supplied from the NCO 106. The NCO 106 and frequency translator 104 are capable of shifting the incoming digital signal with respect to the incoming signal's carrier frequency, thereby producing a frequency shifted digital signal. The NCO 106 is typically a programmable frequency digital signal source. Control for programming the digital frequency of the NCO 106 may be generated by the link processor 120. In some embodiments, control may also be determined by the carrier tracking loop 116, described later, either in conjunction with the link processor 120 or separate from it. The operating range of the NCO 106 may be specified in terms of its frequency offset adjustment range. This range may be determined using a number of factors such as the incoming digital signal's symbol rate and/or the sampling rate the A/D converter 102 uses for processing the incoming baseband signal. In one embodiment, the frequency translator block 104 and NCO 106 allow a frequency offset, determined by the carrier tracking loop 116, to be removed directly in circuitry located in the link circuit 100. Correcting the offset within the link circuit 100 eliminates the possible re-tuning of the tuner, which may result in additional time delay that is undesirable to the user.
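
The frequency translation described above can be sketched as a sample-by-sample multiplication of the digital signal by a complex oscillator. The following Python example is a minimal sketch under the assumption of complex baseband samples; the function names `nco` and `frequency_translate`, the sample rate, and the offset value are hypothetical and chosen only for illustration.

```python
import cmath
import math


def nco(freq_hz, sample_rate_hz, count):
    """Generate `count` samples of a complex oscillator rotating at -freq_hz."""
    step = -2.0 * math.pi * freq_hz / sample_rate_hz
    return [cmath.exp(1j * step * n) for n in range(count)]


def frequency_translate(samples, offset_hz, sample_rate_hz):
    """Shift the spectrum of `samples` down by `offset_hz` using the oscillator output."""
    oscillator = nco(offset_hz, sample_rate_hz, len(samples))
    return [s * o for s, o in zip(samples, oscillator)]


# Hypothetical values: a tone sitting 50 Hz above the expected carrier position.
SAMPLE_RATE_HZ = 1000.0
OFFSET_HZ = 50.0
tone = [cmath.exp(2j * math.pi * OFFSET_HZ * n / SAMPLE_RATE_HZ) for n in range(64)]
shifted = frequency_translate(tone, OFFSET_HZ, SAMPLE_RATE_HZ)

# After translation the tone sits at zero frequency, so every sample is close to 1+0j.
print(max(abs(s - 1) for s in shifted) < 1e-9)   # True
```

In this sketch the offset is a fixed constant; in the circuit described, the value programmed into the oscillator would come from the link processor or the carrier tracking loop.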

The output of the frequency translator 104 supplies the frequency shifted digital signal to the anti-alias filter 108. The anti-alias filter 108 is typically a digital filter that is used to remove signal energy not associated with the desired incoming signal while passing the desired incoming signal essentially unchanged. Depending upon the range of symbol rates of the input signals possible for demodulating in the link circuit 100, the anti-alias filter 108 may be a set of one or more fixed filters or a programmable filter. In a preferred embodiment, anti-alias filter 108 may be programmed to change its passband frequency response and/or other characteristics. In another embodiment, the filter may be programmed to match a passband characteristic of the incoming frequency shifted digital signal. One such passband characteristic may be signal bandwidth.

The filtered digital signal passes into the AGC amp block 110. The AGC amp block 110 contains a gain controllable digital signal amplifier and a signal detector. The signal detector is used to measure the magnitude of the signal that is present. The signal detector may typically detect the total power of the signal, such as the root mean square power, over a time period. The output of the signal detector is connected in a loop as a control signal for the gain controllable digital signal amplifier in a way that the output of the amplifier may be maintained at a constant level. In addition, the detector within the AGC block 110 may be used to provide an indication of the incoming signal level. One output of the detector, the level indicator signal, may be routed to the link processor 120 for further processing.
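
The detector and amplifier loop can be sketched as a block-based gain update. The following Python example is a minimal sketch only; the function name `agc`, the target level, the loop gain, and the block size are assumptions, and real samples are used for simplicity. The detector measures the root mean square level of each amplified block and nudges the gain toward the target.

```python
import math


def agc(samples, target_rms=1.0, loop_gain=0.5, block_size=8):
    """Scale `samples` block by block so the output RMS level approaches `target_rms`."""
    gain = 1.0
    output = []
    levels = []                                  # detector readings, one per block
    for start in range(0, len(samples), block_size):
        block = [gain * x for x in samples[start:start + block_size]]
        rms = math.sqrt(sum(y * y for y in block) / len(block))
        output.extend(block)
        levels.append(rms)
        gain += loop_gain * (target_rms - rms)   # detector output closes the loop
    return output, levels


# Hypothetical weak input: constant magnitude 0.25 with alternating sign.
weak_input = [0.25 if n % 2 == 0 else -0.25 for n in range(1600)]
leveled, level_indicator = agc(weak_input)

print(round(level_indicator[0], 3))    # 0.25 : level before the loop has acted
print(round(level_indicator[-1], 3))   # 1.0  : level after the loop has settled
```

The `levels` list plays the role of a level indicator signal in this sketch, since it records what the detector measured for each block.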

The AGC block 110 outputs a gain compensated signal from its controllable digital signal amplifier and supplies the gain compensated signal to a decimator 112. The decimator 112 reduces the effective sampling rate by removing samples of the gain compensated signal based on a comparison of the incoming signal sampling rate and the required sample rate for the symbol timing recovery block 114.
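
Sample removal of this kind can be sketched in a few lines of Python. The function name `decimate` and the two rates are hypothetical, and the sketch assumes an integer ratio between the incoming rate and the required rate. It performs no filtering of its own, on the assumption that out-of-band energy has already been removed upstream.

```python
def decimate(samples, input_rate_hz, required_rate_hz):
    """Keep every Nth sample, where N is the ratio of input rate to required rate."""
    if input_rate_hz % required_rate_hz != 0:
        raise ValueError("this sketch only handles integer rate ratios")
    factor = input_rate_hz // required_rate_hz
    return samples[::factor]


# Hypothetical rates: 80 MHz incoming, 20 MHz required, so 3 of every 4 samples are dropped.
print(decimate(list(range(12)), 80_000_000, 20_000_000))   # [0, 4, 8]
```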

The symbol timing recovery block 114 contains a control loop that adjusts the phase of the incoming decimated signal in order to optimize the sampling position and allow optimal detection of the symbols of data sent in the incoming signal. The output of the symbol timing recovery block 114 then connects to a block containing the carrier tracking loop 116. The carrier tracking loop 116 contains a control loop that may determine and/or correct the phase and/or frequency of the incoming signal with respect to an expected or correct carrier frequency. The carrier tracking loop 116 may perform the determining and correcting of frequency offset without consideration for the actual values of the symbols of data in the incoming signal.

It is important to note that symbol timing recovery block 114 and carrier tracking loop 116 may be operatively coupled with respect to each other and/or to other blocks in the link circuit 100, as is well known to one skilled in the art.

The output of the carrier tracking loop 116, now a demodulated signal, enters the error correction block 118. Typically the error correction block 118 may contain a symbol slicer module for determining the actual symbol values. The error correction block 118 may also contain the symbol to bit mapper module used to generate the bits, containing data and error correction bits. Additionally, the error correction block 118 contains modules for utilizing the error correction information that has been sent along with the data in the incoming signal. A number of types of error correction methods may be employed in communications systems such as those described herein, as known to those skilled in the art. Some error correction methods may include Reed-Solomon error correction, trellis error correction, or interleaving. Also, some newer types known as turbo code error correction and LDPC error correction may be used. Any of these error correction methods may be used individually or may be combined to work together, as is known by those skilled in the art.

The following describes aspects of a newer type of error correction decoder, known as an LDPC decoder. The decoder may be found in a satellite receiver. Although the present invention is described as operating in an LDPC decoder, the present invention is not limited only to this particular decoder nor is the present invention limited to an LDPC decoder in a satellite receiver.

Referring now to FIG. 2, a portion of one LDPC decoder block 200 of the present invention is described. The LDPC decoder block 200 is located within the error correction block 118 shown in FIG. 1. It should be noted that FIG. 2 only shows one block unit of an entire LDPC decoder structure normally containing many identical blocks (e.g., as many as 360 identical blocks). Each block operates in parallel on small sections of the data segment to process the entire data segment. For clarity, only the functional elements of the overall structure associated with the invention are illustrated.

A block memory 202 includes both input and output connections for connecting to other external processing blocks (i.e., carrier tracking loop and/or interleaver). The block memory 202 also connects to a multiplex switch, the upper mux 204. The upper mux 204 has two outputs, one connecting to a CPU block 206, and the other connecting to a BPU block 208. The CPU block 206 and BPU block 208 each connect to a second multiplex switch, the lower mux 210. The lower mux 210 connects to a CPU/BPU register memory 212. The CPU/BPU register memory 212 connects back to the block memory 202. A controller 214 is connected to the block memory 202, the upper mux 204, the lower mux 210, the CPU/BPU register memory 212 and a controller memory 216.

The block memory 202 may be a Random Access Memory (RAM). The initial input into the block memory 202 may be a data segment containing, for example, demodulated data from the carrier tracking loop 116. The data segment contains data bits as well as error correction bits such as those used in LDPC error correction. Once the local error correction is complete, the output of the block memory 202 may contain a data stream segment representing an LDPC error corrected data stream with the error correction bits associated with LDPC error correction removed. The output of the block memory 202 may connect to another section of error correction, or it may connect to a video, audio, and/or data decoder.

The block memory 202 also supplies requested sections of the data segment to the CPU block 206 and BPU block 208 through upper mux 204. The upper mux 204 controls whether the section of the data segment is sent to the CPU block 206 or BPU block 208. The CPU block 206 and BPU block 208 are the main processors for processing the LDPC data and performing the error correction algorithm. As described earlier, these blocks operate separately from one another, with results of one processing step through BPU block 208 eventually entering the CPU block 206 and vice versa. Further explanation regarding the actual algorithms employed by the CPU block 206 and BPU block 208 is beyond the scope of this invention, but these algorithms are well known to those skilled in the art.

The lower mux 210 may take the final processed output of the CPU block 206 or BPU block 208 and control which output is eventually routed back into the block memory 202. Both the CPU block 206 and BPU block 208 may require some local memory storage, such as a set of registers, for maintaining intermediate values during their internal algorithm processing steps. Typically, the memory will reside in each of the blocks 206 and 208, creating a duplicate set of memory dedicated to each processor. In the present invention, a CPU/BPU register memory 212 may be located beyond the multiplex switch 210 so that memory for the internal processing of the CPU block 206 and BPU block 208 may be shared by both the CPU block 206 and BPU block 208. As a result of sharing the memory space in the CPU/BPU register memory 212, a more efficient overall memory space utilization can be achieved. The sharing further may result in a reduction of total power and an increase of operating speed. The lower mux 210 acts as a bidirectional control switch for the flow of intermediate data values between the CPU/BPU register memory 212 and the one currently active processing block 206 or 208. After the active processing block, CPU block 206 or BPU block 208, has completed, the lower mux 210 then routes the final output of the active processing block back to the block memory 202 through the CPU/BPU register memory 212.

The controller 214 provides access control for both the block memory 202 and CPU/BPU register memory 212. The controller 214 determines which bits or sections of the incoming segment should be read from block memory 202 for processing in the current processing cycle and/or which locations in block memory 202 to send the results of the current processing cycle. The CPU/BPU register memory 212 also receives access control and data directly from the CPU block 206 and BPU block 208. Additionally, the controller 214 provides control for the upper mux 204 and lower mux 210 based on the current processing mode.

The controller 214 utilizes the controller memory 216 to store the array locations for where the data stream elements are located in block memory 202. The array locations provide the proper sequence arrangements between the data bits, parity bits, and parity groups utilized within the CPU and BPU processing steps. Finally, all of the blocks operate using one or more clock signals, not shown. The clock signal(s) may be generated from a source external to the block unit, or generated internally. In one embodiment, the processing clock supplied to the block unit 200 and used by the BPU block 208 and CPU block 206 may be a signal generated as four times the frequency of the clock for the A/D converter 102.

Although the block memory 202, controller 214, and controller memory 216 are shown as part of the block unit in this embodiment, other embodiments may contain one block memory, one controller, and one controller memory for all the block units collectively. The functional description of the aforementioned blocks would not change. Additionally, a number of functions have not been included in the block diagram and are well known to one skilled in the art. The functions may include clock circuits, flag indicators, pipeline registers, and other control functions allowing, for instance, interconnection of information between block units.

Normally, the CPU and BPU functions operate more or less independently. The present invention combines some of the circuits, namely the register memory, used in the CPU and BPU calculation units 206 and 208, to save circuitry. In particular, during the calculations done by the CPU and BPU units 206 and 208, some intermediate results are stored in registers. The present invention uses a multiplexer 210 to share these registers such that only one register memory 212 is needed between both operations. A controller 214 is used to manage the sharing process.

The CPU block 206 may require the data to be processed in a manner different than in the BPU block 208. In one example, the CPU block 206 requires the data to be circular shifted with respect to a starting point in the data path. Conventionally, the shift is accomplished by using the CPU block 206 and requires three separate processes. A first process includes reading a data section out of block memory 202, shifting the data in the CPU block 206, and then sending the newly shifted data section back to the block memory 202. A second process includes reading this newly shifted data section from block memory 202 and sending the newly shifted data to the CPU block 206 for error correction processing. Once the CPU block processing is complete, the processed shifted data is sent back to the block memory 202. Finally, a third process includes reading the processed shifted data from the block memory 202, shifting the processed shifted data back (i.e., reversing the first shift) using the CPU block 206, and sending the newly unshifted processed data back to the block memory 202. Afterward, the data in block memory 202 is ready for processing using the BPU block 208. The three processes use extra clock cycles and therefore limit the number of iterations that can be accomplished within an LDPC frame time. Much greater efficiency in time can be achieved by providing the shift operations in series with the CPU block 306, as will now be described.
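
The cost of the conventional three-process flow can be made concrete by counting memory accesses. The following Python sketch is an illustration only; the class `CountingMemory`, the helper `rotate_left`, the function `three_pass`, and the stand-in processing function are all hypothetical, and a short list stands in for a data section held in block memory.

```python
class CountingMemory:
    """Stand-in for the block memory that counts read and write accesses."""

    def __init__(self, data):
        self.data = list(data)
        self.reads = 0
        self.writes = 0

    def read(self):
        self.reads += 1
        return list(self.data)

    def write(self, data):
        self.writes += 1
        self.data = list(data)


def rotate_left(values, amount):
    """Circularly shift `values` left by `amount` positions (negative shifts right)."""
    amount %= len(values)
    return values[amount:] + values[:amount]


def three_pass(memory, shift, process):
    """Shift, process, and unshift with a full memory round trip for each step."""
    memory.write(rotate_left(memory.read(), shift))    # first process: forward shift
    memory.write(process(memory.read()))               # second process: CPU calculation
    memory.write(rotate_left(memory.read(), -shift))   # third process: reverse shift


memory = CountingMemory([1, 2, 3, 4, 5])
three_pass(memory, 2, lambda section: [10 * v for v in section])
print(memory.data, memory.reads, memory.writes)   # [10, 20, 30, 40, 50] 3 3
```

The data ends up in its original orientation, but only after three reads and three writes; each round trip corresponds to clock cycles that could otherwise be spent on additional decoding iterations.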

Referring now to FIG. 3, a portion of one LDPC decoder block 300 of the present invention is described. The LDPC decoder block 300 is located within the error correction block 118 shown in FIG. 1. It should be noted that FIG. 3 only shows one block unit of an entire LDPC decoder structure normally containing many identical blocks (e.g., as many as 360 identical blocks). Each block operates in parallel on small sections of the data segment to process the entire data segment. For clarity, only the functional elements of the overall structure associated with the invention are illustrated.

A block memory 302 includes both input and output connections for connecting to other external processing blocks (i.e., carrier tracking loop and/or interleaver). The block memory 302 also connects to a multiplex switch, the upper mux 304. The upper mux 304 has two outputs, one connecting to a forward shift block 305 and the other connecting to the BPU block 308. The forward shift block 305 connects to the CPU block 306. The CPU block 306 connects to a reverse shift block 309. The reverse shift block 309 and the BPU block 308 each connect to a second multiplex switch, the lower mux 310. The lower mux 310 connects to CPU/BPU register memory block 312. The CPU/BPU register memory block 312 connects back to the block memory 302. A controller 314 is connected to the block memory 302, the upper mux 304, the lower mux 310, the forward shift block 305, the reverse shift block 309, and controller memory 316.

The block memory 302 may be a RAM. The initial input into the block memory 302 may be a data segment containing, for example, demodulated data from the carrier tracking loop 116. The data segment contains data as well as error correction bits such as those used in LDPC error correction. Once the local error correction is complete, the output of the block memory 302 may contain a data stream segment representing an LDPC error corrected data stream with the error correction bits associated with LDPC error correction removed. The output of the block memory 302 may connect to another section of error correction, or it may connect to a video/audio/data decoder.

The block memory 302 also supplies requested sections of the data segment to the CPU block 306 and BPU block 308 through upper mux 304. The upper mux 304 controls whether the section of the data segment is sent to the CPU block 306 or BPU block 308. The CPU block 306 and BPU block 308 are the main processors for processing the LDPC data and performing the error correction algorithm. As noted earlier, these blocks operate independently and also in series, with results of one BPU process output eventually entering the CPU block 306 and vice versa. Each of the CPU and BPU blocks may also utilize local memory for storing intermediate operations and calculations necessary to process the data stream. Further explanation regarding the actual algorithms employed by the CPU block 306 and BPU block 308 is beyond the scope of this invention, but these algorithms are well known to those skilled in the art.

The data, after being read from block memory 302, is first shifted in the forward shift block 305 and then sent to the CPU block 306. After the CPU block 306 has completed the processing, another shift, reverse to the shift performed on the data before entering the CPU block 306, is performed in the reverse shift block 309. The shift and reverse shift functions permit data to be written back into memory in its proper orientation so that the data may be used by BPU block 308. Finally, the lower mux 310 provides routing from the output of the BPU block 308, and from the path containing the CPU block 306 and shift blocks 305 and 309, to the CPU/BPU register memory 312 and also back to the block memory 302. The CPU/BPU register memory 312 may be located beyond the multiplex switch 310 so that memory for the internal processing of the CPU block 306 and BPU block 308 may be shared by both blocks.

The controller 314 provides access control for both the block memory 302 and CPU/BPU register memory 312. The controller 314 determines which bits or sections of the incoming segment should be read from block memory 302 for processing in the current processing cycle and/or to which locations in block memory 302 the results of the current processing cycle should be written. Additionally, the controller 314 provides control for the upper mux 304 and lower mux 310 based on the current processing mode. The controller 314 also provides control and data inputs for the forward shift block 305 and reverse shift block 309. In one embodiment, the controller 314 may provide a control signal to the reverse shift block 309 to bypass the reverse shift block during CPU processing, allowing the CPU block 306 to access the CPU/BPU register memory 312 during intermediate CPU process operations.

The controller 314 utilizes the controller memory 316 to store the array locations of the data stream elements in the block memory 302. The array locations provide the proper sequence arrangements between the data bits, parity bits, and parity groups utilized within the CPU and BPU processing steps. Additionally, the controller memory 316 may contain the shift settings for each of the data segment sections. These shift settings are then used to program the shift blocks 305 and 309 that are located in the CPU processing path. Finally, all of the blocks operate using one or more clock signals, not shown. The clock signal(s) may be generated from a source external to the block unit, or generated internally. In one embodiment, the processing clock supplied to the block unit and used by the BPU block 308 and CPU block 306 may be a signal generated at four times the frequency of the clock used for the A/D converter 201 in FIG. 1.
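
One way to picture the contents of the controller memory 316 is as a table with one entry per data segment section. The Python sketch below is an assumption made for illustration: the field names, the example values, and the helper `shift_controls` are hypothetical, and the shift is assumed to be circular. It shows how a single stored shift setting could be used to program both shift blocks 305 and 309 so that the two shifts cancel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionEntry:
    """Hypothetical controller-memory record for one data segment section."""
    address: int  # first location of the section in block memory
    length: int   # number of elements in the section
    shift: int    # stored shift setting for the section


def shift_controls(entry):
    """Derive (forward, reverse) circular shift amounts from one setting.

    The reverse amount is chosen so that forward + reverse is a multiple
    of the section length, i.e. the two shifts cancel.
    """
    forward = entry.shift % entry.length
    reverse = (entry.length - forward) % entry.length
    return forward, reverse


# Example values are made up for illustration.
controller_memory = [
    SectionEntry(address=0, length=8, shift=3),
    SectionEntry(address=8, length=8, shift=0),
]

for entry in controller_memory:
    print(entry.address, shift_controls(entry))
```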

Although the block memory 302, controller 314, and controller memory 316 are shown as part of the block unit in this embodiment, other embodiments may contain one block memory, one controller, and one controller memory for all the block units collectively. The functional description of the aforementioned blocks would not change. Additionally, a number of functions have not been included in the block diagram but are well known to one skilled in the art. These functions may include clock circuits, flag indicators, pipeline registers, and other control functions allowing, for instance, interconnection of information between block units.

As described earlier, performance of the LDPC error correction method is related to the number of iterations that the error correction block can perform on a data segment. The number of clock cycles available for processing is therefore very important. By utilizing a serial arrangement to implement the necessary shift operations within only the CPU processing path, a greater number of iterations in the error correction method may be achieved.

Referring now to FIG. 4, a flow chart illustrating a method 400 is shown. The flow chart describes a process that first processes the data through the CPU path and then processes the data through the BPU path of the LDPC block in FIG. 3. In another embodiment, the process could begin with the BPU path and end with the CPU path. Also, the process may continue through as many iterations as are needed to achieve the desired performance level or until the time frame for processing has ended.

First, at step 402, a section of data is read from the block memory 302. The data may then be passed to the forward shift block 305 where, at step 404, the forward shift block 305 performs a shift of the data section. The shift in one preferred embodiment may be a circular shift of the section of data. Next, at step 406, calculations are performed in the CPU block 306. Once the calculations are complete in the CPU block 306, the processed data, at step 408, is reverse shifted in the reverse shift block 309. Next, at step 410, the data is written back into the block memory 302. At this point, the CPU processing for the iteration cycle has been completed.
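
Steps 402 through 410 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the CPU block 306: the block memory is modeled as a plain list, the shift is the circular shift of the preferred embodiment, and the names `circular_shift`, `cpu_path`, and `placeholder` are hypothetical. The placeholder simply adds 100 to each element and stands in for the check node calculations, which are not described here.

```python
def circular_shift(section, amount):
    """Rotate `section` left by `amount` positions, wrapping around."""
    if not section:
        return []
    k = amount % len(section)
    return section[k:] + section[:k]


def cpu_path(block_memory, start, length, shift, process):
    """Run one section through the CPU path and write it back in place."""
    section = block_memory[start:start + length]     # step 402: read
    shifted = circular_shift(section, shift)         # step 404: forward shift
    processed = process(shifted)                     # step 406: CPU block
    restored = circular_shift(processed, -shift)     # step 408: reverse shift
    block_memory[start:start + length] = restored    # step 410: write back


def placeholder(values):
    """Stand-in for the CPU calculations (NOT real check node processing)."""
    return [v + 100 for v in values]


memory = list(range(8))
cpu_path(memory, start=2, length=4, shift=1, process=placeholder)
print(memory)  # [0, 1, 102, 103, 104, 105, 6, 7]
```

Because the reverse shift undoes the forward shift, each processed element lands at the memory location from which its input was read, which is the orientation needed by the subsequent BPU processing.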

Next, at step 412, the section of data is again read out of the block memory 302. The data is passed to the BPU block 308 where, at step 414, the BPU block 308 processes the data. Finally, at step 416, after completion of the processing in the BPU block 308, the data is written back into the block memory 302. At this point, one complete iteration through the process has been completed. As mentioned earlier, the process may iterate multiple times in order to improve performance of the decoder output.
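
The iteration of method 400 can be sketched as a simple driver loop. The Python example below is an assumption-laden illustration: `run_iterations`, its parameters, and the optional `converged` test are hypothetical names, and the CPU and BPU passes are supplied as callables rather than implemented. The loop stops when an iteration limit is reached, when an assumed time budget for the frame has elapsed, or when the optional performance test is satisfied.

```python
import time


def run_iterations(block_memory, cpu_pass, bpu_pass, max_iterations,
                   time_budget_s=None, converged=None):
    """Alternate CPU and BPU passes; return the number of full iterations."""
    started = time.monotonic()
    completed = 0
    for _ in range(max_iterations):
        # Stop if the (assumed) time frame for processing has ended.
        if time_budget_s is not None:
            if time.monotonic() - started >= time_budget_s:
                break
        cpu_pass(block_memory)   # steps 402-410: CPU path
        bpu_pass(block_memory)   # steps 412-416: BPU path
        completed += 1
        # Stop early once the desired performance level is reached.
        if converged is not None and converged(block_memory):
            break
    return completed


# Toy passes that only count how often each path ran.
def toy_cpu(mem):
    mem[0] += 1


def toy_bpu(mem):
    mem[1] += 1


counts = [0, 0]
print(run_iterations(counts, toy_cpu, toy_bpu, max_iterations=3), counts)
```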

Each step in the process shown in FIG. 4 does not necessarily represent one cycle of the processing clock. In fact, many steps may be combined into the same clock cycle by using pipelining structures not illustrated but well known to one skilled in the art. For instance, the steps of reading a set of data, shifting a set of data, processing a set of data in the CPU block, and shifting a set of data back may occur simultaneously on the same clock cycle, given that each set of data is different and consecutive in processing sequence. In most instances, it is necessary that a memory read and write occur on different clock cycles.
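
The overlap described above can be visualized with a small scheduling sketch in Python. It is illustrative only and rests on assumptions not stated in the text: each of the four named stages is taken to occupy exactly one clock cycle, a new data set is assumed to enter on every cycle, and the memory write is not modeled. The names `STAGES` and `schedule` are hypothetical.

```python
STAGES = ["read", "forward_shift", "cpu_process", "reverse_shift"]


def schedule(num_sets):
    """Map each clock cycle to the data set index held by each stage.

    Data set i enters the first stage on cycle i and then advances one
    stage per cycle, so consecutive sets occupy consecutive stages.
    """
    cycles = {}
    for data_set in range(num_sets):
        for depth, stage in enumerate(STAGES):
            cycles.setdefault(data_set + depth, {})[stage] = data_set
    return cycles


for cycle, busy in sorted(schedule(5).items()):
    print(cycle, busy)
```

In this sketch, on cycle 3 the read stage holds set 3, the forward shift holds set 2, the CPU block holds set 1, and the reverse shift holds set 0, so all four stages are active on the same clock cycle with different, consecutive data sets.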

The processing method and architectures permit an efficient approach to processing data in an LDPC decoder, allowing a maximal number of process iterations within an LDPC frame time. Although aspects of the invention have been described separately, it is also possible and expected that the aspects can be combined to gain full advantage of the performance provided by each aspect of the invention.

While the invention may be susceptible to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and have been described in detail herein. However, it should be understood that the invention is not intended to be limited to the particular forms disclosed. Rather, the invention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the invention as defined by the following appended claims.

1. An apparatus for performing error correction comprising: a first memory for storing data to be processed; a first switch coupled to said first memory and having a first and second output; a first processor coupled to said first output of said switch; a second processor coupled to said second output of said switch, said second processor having a forward shift and a reverse shift operation to facilitate a re-ordering of said data; and a controller coupled to said first switch and said second processor for switchably controlling said first switch.

2. The apparatus of claim 1 further comprising a second memory switchably coupled to said first processor and said second processor and having an output coupled to said first memory, wherein said second memory stores intermediate results of said first processor and said second processor.

3. The apparatus of claim 2 wherein said second memory is included to provide a reduction in memory space used.

4. The apparatus of claim 1, wherein said apparatus is used for low density parity check code error correction.

5. The apparatus of claim 1 wherein said first processor is used for bit node processing.

6. The apparatus of claim 1 wherein said second processor is used for check node processing.

7. The apparatus of claim 1 further comprising a third memory coupled to said controller for storing control and access information.

8. The apparatus of claim 1 wherein said apparatus is used for receiving DVB-S2 signals in a satellite receiver.

9. The apparatus of claim 1 wherein said controller further controls said forward shift operation and said reverse shift operation of said second processor.

10. The apparatus of claim 1 wherein said first memory is a RAM.

11. The apparatus of claim 1 wherein said first shift and said second shift are externally and serially coupled to said second processor.

12. The apparatus of claim 1 wherein said first shift and said second shift increase a number of process iterations in a time frame.

13. A method for performing error correction comprising the steps of: reading data from a first memory, said data being arranged in an initial configuration; altering the configuration of said data; processing said altered data using a first process; returning said processed data to said initial configuration; and writing said processed data into said first memory.

14. The method of claim 13 further comprising: reading said processed data from said first memory; reprocessing said data in a second process; and writing said reprocessed data back into said first memory.

15. The method of claim 14, wherein said second process is a bit node process.

16. The method of claim 13 wherein the method is used in an LDPC decoder.

17. The method of claim 13, wherein said first process is a check node process.

18. The method of claim 13, wherein said steps of altering and returning involve forward shifting and reverse shifting respectively.

19. An apparatus comprising: a means for reading data from a first memory, said data being arranged in an initial configuration; a means for altering the configuration of said data; a means for processing said altered data using a first process; a means for returning said first processed data to said initial configuration; and a means for writing said processed data into said first memory.